{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Use \"dogs as a service\" in a distributed fashion with HTTP on Spark\n",
    "\n",
    "In this example we will use the simple HTTP Transformer to call a public webAPI that returns random images of dogs. The service does not use the json payload, but this is for example purposes. \n",
    "\n",
    "A call to the dog service returns json objects structured like:\n",
    "\n",
    "`{\"status\":\"success\",\"message\":\"https:\\/\\/images.dog.ceo\\/breeds\\/lhasa\\/n02098413_2536.jpg\"}`\n",
    "\n",
    "If you visit the link you can download the image:\n",
    "\n",
    "<img src=\"https://images.dog.ceo//breeds//lhasa//n02098413_2536.jpg\"\n",
    "     style=\"width: 250px\" />\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyspark.sql.functions import struct\n",
    "from pyspark.sql.types import *\n",
    "from mmlspark.io.http import *\n",
    "\n",
    "df = spark.createDataFrame([(\"foo\",) for x in range(20)], [\"data\"]) \\\n",
    "      .withColumn(\"inputs\", struct(\"data\"))\n",
    "\n",
    "response_schema = StructType().add(\"status\", StringType()).add(\"message\", StringType())\n",
    "\n",
    "client = SimpleHTTPTransformer() \\\n",
    "  .setInputCol(\"inputs\") \\\n",
    "  .setInputParser(JSONInputParser()) \\\n",
    "  .setOutputParser(JSONOutputParser().setDataType(response_schema)) \\\n",
    "  .setOutputCol(\"results\") \\\n",
    "  .setUrl(\"https://dog.ceo/api/breeds/image/random\")\n",
    "\n",
    "responses = client.transform(df)\n",
    "responses.select(\"results\").show(truncate = False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  },
  "name": "HttpOnSpark - Parallelizing a Custom Web Service",
  "notebookId": 3187910992870443
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
